17 research outputs found
After the literature review and before the manuscript - supporting the middle of the research lifecycle with data, GIS, and digital scholarship services
Traditional library advisory services focus on the beginning and end of the research lifecycle, but what about students and researchers who need help with the steps in the middle? This talk will highlight UC San Diego's public service support for data collection, management, analysis, and visualization. In particular, we will describe our Data & GIS Lab; Digital Media Lab; Digital Scholarship services, including KNIT, our digital commons; Software/Data Carpentry and other workshops; and consultation services for students and researchers.
Skills and Knowledge for Data-Intensive Environmental Research.
The scale and magnitude of complex and pressing environmental issues lend urgency to the need for integrative and reproducible analysis and synthesis, facilitated by data-intensive research approaches. However, the recent pace of technological change has been such that appropriate skills to accomplish data-intensive research are lacking among environmental scientists, who more than ever need greater access to training and mentorship in computational skills. Here, we provide a roadmap for raising data competencies of current and next-generation environmental researchers by describing the concepts and skills needed for effectively engaging with the heterogeneous, distributed, and rapidly growing volumes of available data. We articulate five key skills: (1) data management and processing, (2) analysis, (3) software skills for science, (4) visualization, and (5) communication methods for collaboration and dissemination. We provide an overview of the current suite of training initiatives available to environmental scientists and models for closing the skill-transfer gap.
Ecology under lake ice
Winter conditions are rapidly changing in temperate ecosystems, particularly for those that experience periods of snow and ice cover. Relatively little is known of winter ecology in these systems, due to a historical research focus on summer ‘growing seasons’. We executed the first global quantitative synthesis on under‐ice lake ecology, including 36 abiotic and biotic variables from 42 research groups and 101 lakes, examining seasonal differences and connections as well as how seasonal differences vary with geophysical factors. Plankton were more abundant under ice than expected; mean winter values were 43.2% of summer values for chlorophyll a, 15.8% of summer phytoplankton biovolume and 25.3% of summer zooplankton density. Dissolved nitrogen concentrations were typically higher during winter, and these differences were exaggerated in smaller lakes. Lake size also influenced winter‐summer patterns for dissolved organic carbon (DOC), with higher winter DOC in smaller lakes. At coarse levels of taxonomic aggregation, phytoplankton and zooplankton community composition showed few systematic differences between seasons, although literature suggests that seasonal differences are frequently lake‐specific, species‐specific, or occur at the level of functional group. Within the subset of lakes that had longer time series, winter influenced the subsequent summer for some nutrient variables and zooplankton biomass.
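The winter-to-summer comparisons reported above (e.g., winter chlorophyll a at 43.2% of summer values) can be sketched as a simple ratio of seasonal means. This is a minimal illustration with invented measurements, not the study's actual data or analysis code.

```python
# Hypothetical example: expressing the mean winter value of a variable as a
# percentage of its mean summer value, as in the under-ice synthesis.
# All measurement values below are invented for illustration.

def winter_percent_of_summer(winter_values, summer_values):
    """Mean winter value as a percentage of the mean summer value."""
    mean_winter = sum(winter_values) / len(winter_values)
    mean_summer = sum(summer_values) / len(summer_values)
    return 100.0 * mean_winter / mean_summer

# Invented chlorophyll-a measurements (ug/L) from one lake
winter_chla = [1.1, 0.9, 1.3]
summer_chla = [2.4, 2.8, 2.6]

ratio = winter_percent_of_summer(winter_chla, summer_chla)
print(f"Winter chlorophyll a is {ratio:.1f}% of summer")
```

In the study itself, such ratios were computed across 101 lakes and 36 variables; the function above shows only the per-variable arithmetic.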
UC San Diego Ithaka S+R Research Study: Supporting Big Data Research
During Winter 2020 - Summer 2021, the UC San Diego Library participated in the Ithaka S+R multi-institutional research study “Supporting Big Data Research”. The purpose of this study was to learn more about how researchers on the UC San Diego campus work with big data in their research and provide a set of recommendations to enhance and develop resources and services that will directly support and benefit this research. While definitions of “big data” may vary by discipline, we use the term here to refer to datasets large enough to be challenging to analyze within a traditional spreadsheet, or on a single computer. In-depth interviews with twelve UC San Diego big data researchers of various academic ranks and departmental affiliations were conducted, with questions focusing on various aspects of collecting and analyzing big data, infrastructure needs, research communication and data sharing, and training and support needs. Based on these interviews, we identify three primary themes for big data researchers on the UC San Diego campus: Curation, Open Science, and Training. For each of these areas, we offer a set of recommendations for the campus to better support existing big data research at UC San Diego, as well as develop capacity to support future big data initiatives.
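The study's working definition of "big data" - data too large to analyze in a spreadsheet or hold on a single machine - typically pushes researchers toward streaming or chunked processing. A minimal sketch of that pattern, using only the standard library and invented file contents:

```python
# Minimal sketch of single-pass (streaming) processing: compute a column
# mean without loading all rows into memory, the way a spreadsheet would.
# The CSV content below is invented for illustration.
import csv
import io

def column_mean_streaming(fileobj, column):
    """Compute the mean of one CSV column in a single pass over the rows."""
    reader = csv.DictReader(fileobj)
    total, count = 0.0, 0
    for row in reader:
        total += float(row[column])
        count += 1
    return total / count

# Stand-in for a file too large to open in a spreadsheet
data = io.StringIO("sensor,value\na,1.0\nb,2.0\nc,3.0\n")
print(column_mean_streaming(data, "value"))  # 2.0
```

Because only one row is held in memory at a time, the same loop works whether the file has three rows or three billion; that scale-independence is what distinguishes this approach from spreadsheet-based analysis.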
Leveraging the Strengths of University of California Campus Communities to Reach More Learners Through Open Education
This presentation will discuss the development, implementation, and value of a cross-University of California campus workshop model that breaks down institutional silos and increases open education training for both instructors and learners. During the Fall 2020 Quarter, a team of instructors consisting of librarians, staff, and researchers from UC San Diego, UC Los Angeles, and UC Berkeley planned and taught a virtual suite of foundational computational programming workshops over the course of three weeks to a diverse group of learners from all three UC campuses. This remote, distributed open educational workshop approach combined and leveraged the instructional strengths of each campus to reach significantly more learners while using less staff time than would be possible with individual, campus-focused workshops. This approach to online instruction also enabled the team of instructors to teach at a scale that allowed everyone waitlisted for the workshop to attend, through expanded enrollment and concurrent sessions, without sacrificing the instructor/helper-to-student ratio. From a pedagogical perspective, this model was particularly effective for teaching technical topics to a large group of novice learners, and given the success of this initial collaborative UC workshop, we are planning a second iteration of this workshop series for the beginning of the 2021 Fall quarter/semester that will include additional UC campuses. Ultimately, our goal is to turn this online collaborative workshop model into an effective and efficient instructional template that could be used not just by the UC system, but by all institutions that may need to collaborate with others to meet the computational research training needs of their various communities.
A model for data ethics instruction for non-experts
The dramatic increase in use of technological and algorithmic-based solutions for research, economic, and policy decisions has led to a number of high-profile ethical and privacy violations in the last decade. Current disparities in academic curricula for data and computational science leave significant gaps in ethics training for the next generation of data-intensive researchers. Libraries are often called to fill the curricular gaps in data science training for non-data science disciplines, including within the University of California (UC) system. We found that in addition to incomplete computational training, ethics training is almost completely absent from the standard course curricula. In this report, we highlight the experiences of library data services providers in attempting to meet the need for additional training by designing and running two workshops: Ethical Considerations in Data (2021) and its sequel Data Ethics & Justice (2022). We discuss our interdisciplinary workshop approach and our efforts to highlight resources that can be used by non-experts to engage productively with these topics. Finally, we report a set of recommendations for librarians and data science instructors to more easily incorporate data ethics concepts into curricular instruction.
Surveying Machine Learning Objects Shared in Repositories to Inform Curation Practices
This presentation will provide an overview of our work to date on the 2021 LAUC Research Grant project, “A review of publishing and sharing practices for machine learning objects for informing library curation practices.” We will cover the motivation behind this project, our research methods, and preliminary findings from our work.

Machine learning (ML) is a field of study that combines computer science and statistical techniques to achieve goals by “learning” iteratively through experience, and is becoming more common across a range of disciplines. Creating an ML research object is resource intensive, often requiring large amounts of training and test data and processing power. In addition, ML reproducibility depends on rigorous documentation but often falls short as a consequence of incomplete and/or poorly described components (e.g., training data, source code, algorithms), properties (parameters, methods, workflows, provenance), and computing environments (software packages and versions). Broad sharing of ML outputs, if properly documented and organized with an eye towards reusability, can therefore make future research more efficient and reproducible. To date, formalized guidelines and recommended practices for documenting and sharing ML objects are scarce, at least within library-centric professions and generalist data repositories. We seek to inform our practice by learning about current norms and standards for ML objects, if any, and to share knowledge gained with the broader data curation community.

Our project includes a broad survey of ML objects on a selection of repositories that specialize in ML research workflows and outputs, as well as several generalist repositories. In conducting this broad scan of repositories for ML research objects and analyzing the provided metadata, we aim to identify “good” sharing practices as well as to assess whether the observed frequencies of these practices vary significantly among generalist repositories and across disciplines.
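The documentation-completeness assessment described above can be sketched as a simple check over a repository metadata record. The field names and the example record below are invented for illustration; they are not the project's actual survey instrument or any repository's schema.

```python
# Hypothetical sketch of a documentation-completeness check for a shared ML
# object: flag which of the components and properties named in the abstract
# (training data, source code, parameters, computing environment) are absent
# or empty in a metadata record. Field names are invented.

REQUIRED_FIELDS = ["training_data", "source_code", "parameters", "environment"]

def missing_documentation(record):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

record = {
    "title": "Example classifier",
    "training_data": "doi:10.0000/example",   # invented identifier
    "source_code": "https://example.org/repo",  # invented URL
    "parameters": "",                          # present but empty
    # "environment" missing entirely
}
print(missing_documentation(record))  # ['parameters', 'environment']
```

Tallying the output of such a check across many records is one way to estimate how often each sharing practice is observed per repository or discipline.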
Supporting Big Data Research: Recommendations from the Ithaka S+R Projects at UCB & UCSD
Across the disciplines, from physical sciences to social sciences to digital humanities, researchers are pursuing new and innovative research with big data and data science methodologies. In response, libraries and other campus units have begun to implement instruction, consultation, and curation services to support researchers in planning, managing, sharing, and publishing their data. However, many libraries have not yet fully assessed the needs of big data researchers, much less developed services specifically tailored for researchers producing and working with big data. In order to provide a better understanding of the state of the big data research landscape in higher education, the Ithaka S+R project on “Supporting Big Data Research” brought together twenty-one U.S. institutions to conduct and analyze interviews with big data researchers on their campuses. Teams at UC Berkeley and UC San Diego each assembled rosters of interviewees across domains and ranks to examine practices, challenges, and trends related to data collection, data analysis, research communication, and training. Themes emerged around such issues as data storage, computing infrastructure, data and code sharing, openness, collaboration, and training needs. UCB and UCSD team members will present their common findings, important differences between the two campuses, and potential opportunities for cross-campus collaboration.
The global lake area, climate, and population dataset
An increasing population in conjunction with a changing climate necessitates a detailed understanding of water abundance at multiple spatial and temporal scales. Remote sensing has provided massive data volumes to track fluctuations in water quantity, yet contextualizing water abundance with other local, regional, and global trends remains challenging, often requiring large computational resources to combine multiple data sources into analysis-friendly formats. To bridge this gap and facilitate future freshwater research opportunities, we harmonized existing global datasets to create the Global Lake area, Climate, and Population (GLCP) dataset. The GLCP is a compilation of lake surface area for 1.42+ million lakes and reservoirs of at least 10 ha in size from 1995 to 2015, with co-located basin-level temperature, precipitation, and population data. The GLCP was created with FAIR (findable, accessible, interoperable, reusable) data principles in mind and retains unique identifiers from parent datasets to expedite interoperability. The GLCP offers critical data for basic and applied investigations of lake surface area and water quantity at local, regional, and global scales.
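The interoperability that the GLCP's retained unique identifiers enable can be illustrated with a join of two tables on a shared lake ID. The column names, IDs, and values below are invented for illustration and are not the GLCP's actual schema.

```python
# Hypothetical illustration of joining lake surface area to co-located
# basin-level climate data on a shared lake identifier, the kind of merge
# that retained unique IDs make trivial. All names and values are invented.

lake_area = {101: 950.0, 102: 12.5}  # lake_id -> surface area (ha)
basin_climate = {
    101: {"temp_c": 4.2, "precip_mm": 610},
    102: {"temp_c": 11.8, "precip_mm": 480},
}

def join_on_lake_id(area, climate):
    """Merge the two tables into one record per lake, keyed by lake_id."""
    return {lid: {"area_ha": a, **climate[lid]}
            for lid, a in area.items() if lid in climate}

merged = join_on_lake_id(lake_area, basin_climate)
print(merged[101])  # {'area_ha': 950.0, 'temp_c': 4.2, 'precip_mm': 610}
```

At GLCP scale this merge would be done with a dataframe library rather than plain dictionaries, but the principle is the same: a stable shared key lets independently produced datasets be combined without ambiguity.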